Overview

Dataset statistics

Number of variables: 25
Number of observations: 10000
Missing cells: 311
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 MiB
Average record size in memory: 200.0 B
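
The percentages and sizes in this block can be recovered from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables times observations) and that the total size is the average record size times the number of records. The variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics block.
n_variables = 25
n_observations = 10_000
missing_cells = 311
avg_record_bytes = 200.0

# Assumption: the percentage is taken over all cells, not over rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size = average record size x number of records.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under these assumptions, both values round to the figures shown in the report (0.1% and 1.9 MiB).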

Variable types

Numeric: 14
DateTime: 2
Categorical: 9

Warnings

AVERAGE_ACCT_AGE has a high cardinality: 112 distinct values (High cardinality)
CREDIT_HISTORY_LENGTH has a high cardinality: 192 distinct values (High cardinality)
DISBURSED_AMOUNT is highly correlated with ASSET_COST (High correlation)
ASSET_COST is highly correlated with DISBURSED_AMOUNT (High correlation)
AADHAR_FLAG is highly correlated with VOTERID_FLAG (High correlation)
VOTERID_FLAG is highly correlated with AADHAR_FLAG (High correlation)
DISBURSED_AMOUNT is highly correlated with ASSET_COST (High correlation)
ASSET_COST is highly correlated with DISBURSED_AMOUNT (High correlation)
AADHAR_FLAG is highly correlated with VOTERID_FLAG (High correlation)
VOTERID_FLAG is highly correlated with AADHAR_FLAG (High correlation)
df_index is highly correlated with AADHAR_FLAG and 7 other fields (High correlation)
DISBURSED_AMOUNT is highly correlated with ASSET_COST and 8 other fields (High correlation)
ASSET_COST is highly correlated with DISBURSED_AMOUNT and 8 other fields (High correlation)
LTV is highly correlated with AADHAR_FLAG and 7 other fields (High correlation)
BRANCH_ID is highly correlated with AADHAR_FLAG and 7 other fields (High correlation)
SUPPLIER_ID is highly correlated with AADHAR_FLAG and 7 other fields (High correlation)
MANUFACTURER_ID is highly correlated with AADHAR_FLAG and 7 other fields (High correlation)
CURRENT_PINCODE_ID is highly correlated with AADHAR_FLAG and 6 other fields (High correlation)
STATE_ID is highly correlated with AADHAR_FLAG and 7 other fields (High correlation)
EMPLOYEE_CODE_ID is highly correlated with AADHAR_FLAG and 7 other fields (High correlation)
AADHAR_FLAG is highly correlated with df_index and 16 other fields (High correlation)
PAN_FLAG is highly correlated with df_index and 16 other fields (High correlation)
VOTERID_FLAG is highly correlated with df_index and 16 other fields (High correlation)
PASSPORT_FLAG is highly correlated with df_index and 16 other fields (High correlation)
PERFORM_CNS_SCORE is highly correlated with PRI_OVERDUE_ACCTS and 2 other fields (High correlation)
PRI_OVERDUE_ACCTS is highly correlated with df_index and 16 other fields (High correlation)
NEW_ACCTS_IN_LAST_SIX_MONTHS is highly correlated with df_index and 16 other fields (High correlation)
NO_OF_INQUIRIES is highly correlated with df_index and 16 other fields (High correlation)
LOAN_DEFAULT is highly correlated with df_index and 14 other fields (High correlation)
BRANCH_ID is highly correlated with CURRENT_PINCODE_ID and 1 other field (High correlation)
CURRENT_PINCODE_ID is highly correlated with BRANCH_ID and 1 other field (High correlation)
LOAN_DEFAULT is highly correlated with EMPLOYMENT_TYPE and 1 other field (High correlation)
EMPLOYMENT_TYPE is highly correlated with LOAN_DEFAULT (High correlation)
AADHAR_FLAG is highly correlated with VOTERID_FLAG and 1 other field (High correlation)
DISBURSED_AMOUNT is highly correlated with LTV and 2 other fields (High correlation)
LTV is highly correlated with DISBURSED_AMOUNT (High correlation)
MANUFACTURER_ID is highly correlated with DISBURSED_AMOUNT and 1 other field (High correlation)
ASSET_COST is highly correlated with DISBURSED_AMOUNT and 1 other field (High correlation)
PERFORM_CNS_SCORE_DESCRIPTION is highly correlated with PERFORM_CNS_SCORE (High correlation)
PERFORM_CNS_SCORE is highly correlated with LOAN_DEFAULT and 1 other field (High correlation)
VOTERID_FLAG is highly correlated with AADHAR_FLAG and 1 other field (High correlation)
STATE_ID is highly correlated with BRANCH_ID and 3 other fields (High correlation)
AADHAR_FLAG is highly correlated with VOTERID_FLAG (High correlation)
VOTERID_FLAG is highly correlated with AADHAR_FLAG (High correlation)
EMPLOYMENT_TYPE has 311 (3.1%) missing values (Missing)
df_index has unique values (Unique)
PERFORM_CNS_SCORE has 4911 (49.1%) zeros (Zeros)
PRI_OVERDUE_ACCTS has 8846 (88.5%) zeros (Zeros)
NEW_ACCTS_IN_LAST_SIX_MONTHS has 7755 (77.5%) zeros (Zeros)
NO_OF_INQUIRIES has 8673 (86.7%) zeros (Zeros)
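
The percentages in the Missing and Zeros warnings are the flagged counts divided by the 10000 observations. The sketch below recomputes them. The dictionary keys are illustrative labels, and the report rounds each share to one decimal place.

```python
n_observations = 10_000

# Counts copied from the Missing and Zeros warnings.
flagged_counts = {
    "EMPLOYMENT_TYPE (missing)": 311,
    "PERFORM_CNS_SCORE (zeros)": 4911,
    "PRI_OVERDUE_ACCTS (zeros)": 8846,
    "NEW_ACCTS_IN_LAST_SIX_MONTHS (zeros)": 7755,
    "NO_OF_INQUIRIES (zeros)": 8673,
}

# Share of observations affected, in percent.
shares = {
    name: 100 * count / n_observations
    for name, count in flagged_counts.items()
}

for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```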

Reproduction

Analysis started: 2021-08-02 18:56:16.288063
Analysis finished: 2021-08-02 18:56:52.555009
Duration: 36.27 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
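
The reported duration is the difference between the two timestamps. A minimal check using only the standard library is shown below; the format string is an assumption based on how the timestamps are printed.

```python
from datetime import datetime

# Assumed timestamp layout, matching the two lines in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-08-02 18:56:16.288063", FMT)
finished = datetime.strptime("2021-08-02 18:56:52.555009", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```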

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67635.9967
Minimum: 4
Maximum: 133146
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 4
5-th percentile: 7296.55
Q1: 34280.25
median: 68325.5
Q3: 101181
95-th percentile: 126785.15
Maximum: 133146
Range: 133142
Interquartile range (IQR): 66900.75

Descriptive statistics

Standard deviation: 38516.74744
Coefficient of variation (CV): 0.5694711296
Kurtosis: -1.207124721
Mean: 67635.9967
Median Absolute Deviation (MAD): 33552
Skewness: -0.02718809912
Sum: 676359967
Variance: 1483539833
Monotonicity: Not monotonic
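
Several of the figures for df_index are derived from others in the same block. The sketch below recomputes them from the reported values using the standard definitions (range, interquartile range, coefficient of variation, variance as the squared standard deviation, sum as mean times count). Small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the df_index summary.
n = 10_000
minimum, maximum = 4, 133146
q1, q3 = 34280.25, 101181
mean = 67635.9967
std = 38516.74744

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
total = mean * n                 # Sum

print(value_range, iqr, round(cv, 6), round(variance), round(total))
```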
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
41225 | 1 | < 0.1%
111514 | 1 | < 0.1%
39229 | 1 | < 0.1%
35083 | 1 | < 0.1%
66190 | 1 | < 0.1%
39292 | 1 | < 0.1%
123557 | 1 | < 0.1%
82022 | 1 | < 0.1%
75635 | 1 | < 0.1%
96055 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
4 | 1 | < 0.1%
7 | 1 | < 0.1%
9 | 1 | < 0.1%
17 | 1 | < 0.1%
29 | 1 | < 0.1%
43 | 1 | < 0.1%
50 | 1 | < 0.1%
55 | 1 | < 0.1%
60 | 1 | < 0.1%
61 | 1 | < 0.1%

Value | Count | Frequency (%)
133146 | 1 | < 0.1%
133138 | 1 | < 0.1%
133118 | 1 | < 0.1%
133095 | 1 | < 0.1%
133085 | 1 | < 0.1%
133043 | 1 | < 0.1%
133036 | 1 | < 0.1%
133015 | 1 | < 0.1%
133003 | 1 | < 0.1%
133002 | 1 | < 0.1%

DISBURSED_AMOUNT
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3410
Distinct (%): 34.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54557.3842
Minimum: 13664
Maximum: 191392
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 13664
5-th percentile: 35139
Q1: 47349
median: 53908
Q3: 60613.25
95-th percentile: 74251
Maximum: 191392
Range: 177728
Interquartile range (IQR): 13264.25

Descriptive statistics

Standard deviation: 12546.24397
Coefficient of variation (CV): 0.2299641773
Kurtosis: 4.63831939
Mean: 54557.3842
Median Absolute Deviation (MAD): 6559
Skewness: 0.9504849324
Sum: 545573842
Variance: 157408237.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
53303 | 99 | 1.0%
52303 | 99 | 1.0%
48349 | 98 | 1.0%
47349 | 94 | 0.9%
50303 | 89 | 0.9%
51303 | 87 | 0.9%
56259 | 83 | 0.8%
55259 | 77 | 0.8%
46349 | 74 | 0.7%
57259 | 68 | 0.7%
Other values (3400) | 9132 | 91.3%

Value | Count | Frequency (%)
13664 | 1 | < 0.1%
14164 | 1 | < 0.1%
14455 | 1 | < 0.1%
15156 | 1 | < 0.1%
15540 | 1 | < 0.1%
15910 | 1 | < 0.1%
16700 | 1 | < 0.1%
17060 | 1 | < 0.1%
17119 | 1 | < 0.1%
18040 | 1 | < 0.1%

Value | Count | Frequency (%)
191392 | 1 | < 0.1%
167018 | 1 | < 0.1%
134265 | 1 | < 0.1%
131509 | 1 | < 0.1%
129485 | 1 | < 0.1%
127328 | 1 | < 0.1%
127103 | 1 | < 0.1%
120706 | 3 | < 0.1%
120312 | 1 | < 0.1%
119807 | 1 | < 0.1%

ASSET_COST
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 7311
Distinct (%): 73.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76110.3872
Minimum: 38055
Maximum: 255315
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 38055
5-th percentile: 58684.55
Q1: 65774
median: 71066
Q3: 79693
95-th percentile: 109874.2
Maximum: 255315
Range: 217260
Interquartile range (IQR): 13919

Descriptive statistics

Standard deviation: 18376.77053
Coefficient of variation (CV): 0.2414489165
Kurtosis: 9.95686984
Mean: 76110.3872
Median Absolute Deviation (MAD): 6466
Skewness: 2.462493213
Sum: 761103872
Variance: 337705695
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
68000 | 30 | 0.3%
69000 | 23 | 0.2%
73000 | 22 | 0.2%
78000 | 22 | 0.2%
75000 | 21 | 0.2%
66000 | 20 | 0.2%
72000 | 20 | 0.2%
62188 | 20 | 0.2%
77000 | 19 | 0.2%
71000 | 19 | 0.2%
Other values (7301) | 9784 | 97.8%

Value | Count | Frequency (%)
38055 | 1 | < 0.1%
39459 | 1 | < 0.1%
39620 | 1 | < 0.1%
40167 | 1 | < 0.1%
40668 | 1 | < 0.1%
40680 | 1 | < 0.1%
40768 | 2 | < 0.1%
40986 | 1 | < 0.1%
41000 | 1 | < 0.1%
41064 | 1 | < 0.1%

Value | Count | Frequency (%)
255315 | 1 | < 0.1%
254177 | 1 | < 0.1%
251767 | 1 | < 0.1%
241134 | 1 | < 0.1%
199951 | 1 | < 0.1%
199371 | 1 | < 0.1%
196451 | 1 | < 0.1%
189130 | 1 | < 0.1%
183545 | 1 | < 0.1%
182719 | 2 | < 0.1%

LTV
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3533
Distinct (%): 35.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.76375
Minimum: 19.3
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 19.3
5-th percentile: 52.48
Q1: 68.85
median: 76.715
Q3: 83.555
95-th percentile: 89.35
Maximum: 95
Range: 75.7
Interquartile range (IQR): 14.705

Descriptive statistics

Standard deviation: 11.3507564
Coefficient of variation (CV): 0.1518216569
Kurtosis: 1.343324659
Mean: 74.76375
Median Absolute Deviation (MAD): 7.215
Skewness: -1.068962027
Sum: 747637.5
Variance: 128.8396709
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
85 | 167 | 1.7%
84.99 | 45 | 0.4%
79.99 | 26 | 0.3%
79.92 | 21 | 0.2%
75 | 19 | 0.2%
89.96 | 19 | 0.2%
80 | 19 | 0.2%
74.93 | 18 | 0.2%
79.79 | 18 | 0.2%
74.91 | 17 | 0.2%
Other values (3523) | 9631 | 96.3%

Value | Count | Frequency (%)
19.3 | 1 | < 0.1%
21.12 | 1 | < 0.1%
21.64 | 1 | < 0.1%
22.34 | 1 | < 0.1%
22.4 | 1 | < 0.1%
22.45 | 1 | < 0.1%
22.83 | 1 | < 0.1%
24.5 | 1 | < 0.1%
25.35 | 1 | < 0.1%
25.42 | 1 | < 0.1%

Value | Count | Frequency (%)
95 | 1 | < 0.1%
94.99 | 1 | < 0.1%
94.98 | 1 | < 0.1%
94.94 | 2 | < 0.1%
94.92 | 1 | < 0.1%
94.9 | 1 | < 0.1%
94.86 | 1 | < 0.1%
94.85 | 1 | < 0.1%
94.83 | 1 | < 0.1%
94.78 | 1 | < 0.1%

BRANCH_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 82
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.2591
Minimum: 1
Maximum: 261
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 13
median: 61
Q3: 121
95-th percentile: 248
Maximum: 261
Range: 260
Interquartile range (IQR): 108

Descriptive statistics

Standard deviation: 69.39148712
Coefficient of variation (CV): 0.9737912368
Kurtosis: 0.3631914529
Mean: 71.2591
Median Absolute Deviation (MAD): 50
Skewness: 1.054904565
Sum: 712591
Variance: 4815.178485
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2 | 590 | 5.9%
67 | 456 | 4.6%
5 | 405 | 4.0%
3 | 389 | 3.9%
36 | 384 | 3.8%
34 | 361 | 3.6%
136 | 345 | 3.5%
1 | 302 | 3.0%
16 | 295 | 2.9%
19 | 250 | 2.5%
Other values (72) | 6223 | 62.2%

Value | Count | Frequency (%)
1 | 302 | 3.0%
2 | 590 | 5.9%
3 | 389 | 3.9%
5 | 405 | 4.0%
7 | 144 | 1.4%
8 | 145 | 1.5%
9 | 103 | 1.0%
10 | 178 | 1.8%
11 | 181 | 1.8%
13 | 132 | 1.3%

Value | Count | Frequency (%)
261 | 7 | 0.1%
260 | 18 | 0.2%
259 | 18 | 0.2%
258 | 9 | 0.1%
257 | 44 | 0.4%
255 | 61 | 0.6%
254 | 72 | 0.7%
251 | 166 | 1.7%
250 | 54 | 0.5%
249 | 43 | 0.4%

SUPPLIER_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1968
Distinct (%): 19.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19683.0378
Minimum: 10524
Maximum: 24789
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 10524
5-th percentile: 14197.55
Q1: 16565
median: 20470
Q3: 23000
95-th percentile: 24151.1
Maximum: 24789
Range: 14265
Interquartile range (IQR): 6435

Descriptive statistics

Standard deviation: 3484.60368
Coefficient of variation (CV): 0.1770358679
Kurtosis: -1.459941397
Mean: 19683.0378
Median Absolute Deviation (MAD): 2981
Skewness: -0.1911875221
Sum: 196830378
Variance: 12142462.81
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
15694 | 63 | 0.6%
18317 | 62 | 0.6%
15663 | 61 | 0.6%
14234 | 57 | 0.6%
18166 | 54 | 0.5%
21124 | 53 | 0.5%
17980 | 52 | 0.5%
14375 | 49 | 0.5%
14347 | 47 | 0.5%
14115 | 46 | 0.5%
Other values (1958) | 9456 | 94.6%

Value | Count | Frequency (%)
10524 | 1 | < 0.1%
12312 | 2 | < 0.1%
12374 | 5 | 0.1%
12441 | 2 | < 0.1%
12456 | 7 | 0.1%
12500 | 2 | < 0.1%
12534 | 2 | < 0.1%
12797 | 2 | < 0.1%
12842 | 2 | < 0.1%
12878 | 1 | < 0.1%

Value | Count | Frequency (%)
24789 | 1 | < 0.1%
24760 | 1 | < 0.1%
24754 | 1 | < 0.1%
24753 | 1 | < 0.1%
24745 | 2 | < 0.1%
24732 | 4 | < 0.1%
24729 | 1 | < 0.1%
24728 | 1 | < 0.1%
24719 | 2 | < 0.1%
24716 | 1 | < 0.1%

MANUFACTURER_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.3938
Minimum: 45
Maximum: 145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 45
5-th percentile: 45
Q1: 48
median: 86
Q3: 86
95-th percentile: 86
Maximum: 145
Range: 100
Interquartile range (IQR): 38

Descriptive statistics

Standard deviation: 22.17749097
Coefficient of variation (CV): 0.3195889398
Kurtosis: -0.6923296466
Mean: 69.3938
Median Absolute Deviation (MAD): 34
Skewness: 0.3737814211
Sum: 693938
Variance: 491.8411057
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Value | Count | Frequency (%)
86 | 4768 | 47.7%
45 | 2374 | 23.7%
51 | 1157 | 11.6%
48 | 715 | 7.1%
49 | 421 | 4.2%
120 | 419 | 4.2%
67 | 107 | 1.1%
145 | 39 | 0.4%

Value | Count | Frequency (%)
45 | 2374 | 23.7%
48 | 715 | 7.1%
49 | 421 | 4.2%
51 | 1157 | 11.6%
67 | 107 | 1.1%
86 | 4768 | 47.7%
120 | 419 | 4.2%
145 | 39 | 0.4%

Value | Count | Frequency (%)
145 | 39 | 0.4%
120 | 419 | 4.2%
86 | 4768 | 47.7%
67 | 107 | 1.1%
51 | 1157 | 11.6%
49 | 421 | 4.2%
48 | 715 | 7.1%
45 | 2374 | 23.7%
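
Because MANUFACTURER_ID takes only eight distinct values, its mean and quartiles can be recovered from the frequency table alone. The sketch below uses a simple inverse-CDF quantile with no interpolation. That definition is an assumption: the report may interpolate between order statistics, although for this variable the neighbouring observations share the same value, so the results agree. The helper name quantile_from_counts is hypothetical.

```python
# Value -> count, copied from the MANUFACTURER_ID frequency table.
counts = {45: 2374, 48: 715, 49: 421, 51: 1157,
          67: 107, 86: 4768, 120: 419, 145: 39}


def quantile_from_counts(counts, q):
    """Smallest value whose cumulative count reaches q * n (no interpolation)."""
    n = sum(counts.values())
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= q * n:
            return value
    raise ValueError("q must be in (0, 1]")


n = sum(counts.values())
mean = sum(value * count for value, count in counts.items()) / n

q1 = quantile_from_counts(counts, 0.25)
median = quantile_from_counts(counts, 0.50)
q3 = quantile_from_counts(counts, 0.75)

print(n, mean, q1, median, q3)
```

With these counts the sketch reproduces the reported mean (69.3938), Q1 (48), median (86) and Q3 (86).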

CURRENT_PINCODE_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3149
Distinct (%): 31.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3415.9663
Minimum: 1
Maximum: 7345
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 272
Q1: 1515
median: 2989
Q3: 5666
95-th percentile: 6944
Maximum: 7345
Range: 7344
Interquartile range (IQR): 4151

Descriptive statistics

Standard deviation: 2233.76098
Coefficient of variation (CV): 0.6539177452
Kurtosis: -1.283278229
Mean: 3415.9663
Median Absolute Deviation (MAD): 1931.5
Skewness: 0.2637899003
Sum: 34159663
Variance: 4989688.118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2578 | 81 | 0.8%
1446 | 71 | 0.7%
1515 | 51 | 0.5%
2989 | 39 | 0.4%
2943 | 39 | 0.4%
2782 | 37 | 0.4%
3000 | 34 | 0.3%
1794 | 33 | 0.3%
1511 | 30 | 0.3%
2378 | 29 | 0.3%
Other values (3139) | 9556 | 95.6%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | < 0.1%
4 | 2 | < 0.1%
5 | 10 | 0.1%
6 | 7 | 0.1%
7 | 10 | 0.1%
9 | 2 | < 0.1%
12 | 2 | < 0.1%
13 | 3 | < 0.1%
14 | 6 | 0.1%

Value | Count | Frequency (%)
7345 | 1 | < 0.1%
7336 | 1 | < 0.1%
7333 | 1 | < 0.1%
7325 | 2 | < 0.1%
7324 | 1 | < 0.1%
7322 | 2 | < 0.1%
7321 | 4 | < 0.1%
7313 | 1 | < 0.1%
7308 | 1 | < 0.1%
7306 | 1 | < 0.1%

Distinct: 5317
Distinct (%): 53.2%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Minimum: 1954-07-17 00:00:00
Maximum: 2000-09-21 00:00:00
Histogram with fixed size bins (bins=50)

EMPLOYMENT_TYPE
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 311
Missing (%): 3.1%
Memory size: 78.2 KiB
Self employed | 5523
Salaried | 4166

Length

Max length: 13
Median length: 13
Mean length: 10.85013933
Min length: 8

Characters and Unicode

Total characters: 105127
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
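
The length and character totals for EMPLOYMENT_TYPE follow directly from the two category labels and their counts. A minimal sketch, assuming missing values are excluded from the length statistics (which is consistent with the reported mean):

```python
# Category -> count, copied from the EMPLOYMENT_TYPE summary (missing values excluded).
category_counts = {"Self employed": 5523, "Salaried": 4166}

# Every occurrence of a label contributes len(label) characters.
total_characters = sum(
    len(label) * count for label, count in category_counts.items()
)
non_missing = sum(category_counts.values())
mean_length = total_characters / non_missing

print(total_characters, non_missing, round(mean_length, 8))
```

This gives 105127 total characters over 9689 non-missing rows, in line with the reported mean length of 10.85013933.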

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Salaried
2nd row: Salaried
3rd row: Self employed
4th row: Self employed
5th row: Self employed

Common Values

Value | Count | Frequency (%)
Self employed | 5523 | 55.2%
Salaried | 4166 | 41.7%
(Missing) | 311 | 3.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
self | 5523 | 36.3%
employed | 5523 | 36.3%
salaried | 4166 | 27.4%

Most occurring characters

Value | Count | Frequency (%)
e | 20735 | 19.7%
l | 15212 | 14.5%
S | 9689 | 9.2%
d | 9689 | 9.2%
a | 8332 | 7.9%
f | 5523 | 5.3%
(space) | 5523 | 5.3%
m | 5523 | 5.3%
p | 5523 | 5.3%
o | 5523 | 5.3%
Other values (3) | 13855 | 13.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 89915 | 85.5%
Uppercase Letter | 9689 | 9.2%
Space Separator | 5523 | 5.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 20735 | 23.1%
l | 15212 | 16.9%
d | 9689 | 10.8%
a | 8332 | 9.3%
f | 5523 | 6.1%
m | 5523 | 6.1%
p | 5523 | 6.1%
o | 5523 | 6.1%
y | 5523 | 6.1%
r | 4166 | 4.6%
Uppercase Letter
Value | Count | Frequency (%)
S | 9689 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 5523 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 99604 | 94.7%
Common | 5523 | 5.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 20735 | 20.8%
l | 15212 | 15.3%
S | 9689 | 9.7%
d | 9689 | 9.7%
a | 8332 | 8.4%
f | 5523 | 5.5%
m | 5523 | 5.5%
p | 5523 | 5.5%
o | 5523 | 5.5%
y | 5523 | 5.5%
Other values (2) | 8332 | 8.4%
Common
Value | Count | Frequency (%)
(space) | 5523 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 105127 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 20735 | 19.7%
l | 15212 | 14.5%
S | 9689 | 9.2%
d | 9689 | 9.2%
a | 8332 | 7.9%
f | 5523 | 5.3%
(space) | 5523 | 5.3%
m | 5523 | 5.3%
p | 5523 | 5.3%
o | 5523 | 5.3%
Other values (3) | 13855 | 13.2%

Distinct: 84
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Minimum: 2018-08-01 00:00:00
Maximum: 2018-10-31 00:00:00
Histogram with fixed size bins (bins=50)

STATE_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 22
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.2583
Minimum: 1
Maximum: 22
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 6
Q3: 10
95-th percentile: 16
Maximum: 22
Range: 21
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.456340091
Coefficient of variation (CV): 0.613964715
Kurtosis: -0.3567451897
Mean: 7.2583
Median Absolute Deviation (MAD): 3
Skewness: 0.8061249074
Sum: 72583
Variance: 19.85896701
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)

Value | Count | Frequency (%)
4 | 1946 | 19.5%
3 | 1494 | 14.9%
6 | 1373 | 13.7%
13 | 776 | 7.8%
9 | 718 | 7.2%
8 | 625 | 6.2%
14 | 414 | 4.1%
5 | 406 | 4.1%
1 | 380 | 3.8%
7 | 313 | 3.1%
Other values (12) | 1555 | 15.6%

Value | Count | Frequency (%)
1 | 380 | 3.8%
2 | 160 | 1.6%
3 | 1494 | 14.9%
4 | 1946 | 19.5%
5 | 406 | 4.1%
6 | 1373 | 13.7%
7 | 313 | 3.1%
8 | 625 | 6.2%
9 | 718 | 7.2%
10 | 144 | 1.4%

Value | Count | Frequency (%)
22 | 5 | 0.1%
21 | 1 | < 0.1%
20 | 4 | < 0.1%
19 | 42 | 0.4%
18 | 234 | 2.3%
17 | 160 | 1.6%
16 | 108 | 1.1%
15 | 212 | 2.1%
14 | 414 | 4.1%
13 | 776 | 7.8%

EMPLOYEE_CODE_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2629
Distinct (%): 26.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1548.3745
Minimum: 1
Maximum: 3783
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 152.95
Q1: 707
median: 1441.5
Q3: 2369
95-th percentile: 3188.05
Maximum: 3783
Range: 3782
Interquartile range (IQR): 1662

Descriptive statistics

Standard deviation: 978.2866477
Coefficient of variation (CV): 0.6318152667
Kurtosis: -1.062187128
Mean: 1548.3745
Median Absolute Deviation (MAD): 816.5
Skewness: 0.2470667032
Sum: 15483745
Variance: 957044.7651
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2546 | 30 | 0.3%
620 | 27 | 0.3%
255 | 20 | 0.2%
130 | 20 | 0.2%
184 | 17 | 0.2%
908 | 17 | 0.2%
3132 | 17 | 0.2%
2153 | 17 | 0.2%
237 | 16 | 0.2%
751 | 16 | 0.2%
Other values (2619) | 9803 | 98.0%

Value | Count | Frequency (%)
1 | 5 | 0.1%
3 | 4 | < 0.1%
4 | 2 | < 0.1%
5 | 1 | < 0.1%
7 | 8 | 0.1%
9 | 2 | < 0.1%
10 | 2 | < 0.1%
11 | 2 | < 0.1%
12 | 5 | 0.1%
15 | 3 | < 0.1%

Value | Count | Frequency (%)
3783 | 1 | < 0.1%
3775 | 1 | < 0.1%
3768 | 1 | < 0.1%
3761 | 1 | < 0.1%
3759 | 1 | < 0.1%
3757 | 1 | < 0.1%
3748 | 1 | < 0.1%
3744 | 1 | < 0.1%
3736 | 1 | < 0.1%
3724 | 1 | < 0.1%

AADHAR_FLAG
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
1 | 8397
0 | 1603

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 8397 | 84.0%
0 | 1603 | 16.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 8397 | 84.0%
0 | 1603 | 16.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 8397 | 84.0%
0 | 1603 | 16.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 10000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 8397 | 84.0%
0 | 1603 | 16.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 8397 | 84.0%
0 | 1603 | 16.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 8397 | 84.0%
0 | 1603 | 16.0%

PAN_FLAG
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
0 | 9241
1 | 759

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9241 | 92.4%
1 | 759 | 7.6%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 9241 | 92.4%
1 | 759 | 7.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 9241 | 92.4%
1 | 759 | 7.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 10000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 9241 | 92.4%
1 | 759 | 7.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 9241 | 92.4%
1 | 759 | 7.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 9241 | 92.4%
1 | 759 | 7.6%

VOTERID_FLAG
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
0 | 8531
1 | 1469

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 8531 | 85.3%
1 | 1469 | 14.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 8531 | 85.3%
1 | 1469 | 14.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 8531 | 85.3%
1 | 1469 | 14.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 10000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 8531 | 85.3%
1 | 1469 | 14.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 8531 | 85.3%
1 | 1469 | 14.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 8531 | 85.3%
1 | 1469 | 14.7%

PASSPORT_FLAG
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
0 | 9970
1 | 30

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9970 | 99.7%
1 | 30 | 0.3%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 9970 | 99.7%
1 | 30 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 9970 | 99.7%
1 | 30 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 10000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 9970 | 99.7%
1 | 30 | 0.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 9970 | 99.7%
1 | 30 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 9970 | 99.7%
1 | 30 | 0.3%

PERFORM_CNS_SCORE
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 485
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 295.1243
Minimum: 0
Maximum: 890
Zeros: 4911
Zeros (%): 49.1%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 15
Q3: 680
95-th percentile: 825
Maximum: 890
Range: 890
Interquartile range (IQR): 680

Descriptive statistics

Standard deviation: 339.4487016
Coefficient of variation (CV): 1.150188926
Kurtosis: -1.658707342
Mean: 295.1243
Median Absolute Deviation (MAD): 15
Skewness: 0.4151665359
Sum: 2951243
Variance: 115225.421
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 4911 | 49.1%
300 | 385 | 3.9%
738 | 352 | 3.5%
825 | 339 | 3.4%
17 | 170 | 1.7%
15 | 158 | 1.6%
763 | 134 | 1.3%
16 | 131 | 1.3%
708 | 88 | 0.9%
737 | 82 | 0.8%
Other values (475) | 3250 | 32.5%

Value | Count | Frequency (%)
0 | 4911 | 49.1%
14 | 38 | 0.4%
15 | 158 | 1.6%
16 | 131 | 1.3%
17 | 170 | 1.7%
18 | 74 | 0.7%
300 | 385 | 3.9%
301 | 1 | < 0.1%
302 | 1 | < 0.1%
305 | 3 | < 0.1%

Value | Count | Frequency (%)
890 | 1 | < 0.1%
879 | 5 | 0.1%
858 | 1 | < 0.1%
853 | 2 | < 0.1%
852 | 1 | < 0.1%
850 | 1 | < 0.1%
849 | 1 | < 0.1%
845 | 26 | 0.3%
844 | 3 | < 0.1%
839 | 4 | < 0.1%

PERFORM_CNS_SCORE_DESCRIPTION
Categorical

HIGH CORRELATION

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
No Bureau History Available | 4911
C-Very Low Risk | 688
A-Very Low Risk | 632
D-Very Low Risk | 513
B-Very Low Risk | 405
Other values (14) | 2851

Length

Max length: 55
Median length: 27
Mean length: 22.1557
Min length: 10

Characters and Unicode

Total characters: 221557
Distinct characters: 48
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No Bureau History Available
2nd row: No Bureau History Available
3rd row: No Bureau History Available
4th row: No Bureau History Available
5th row: E-Low Risk

Common Values

Value | Count | Frequency (%)
No Bureau History Available | 4911 | 49.1%
C-Very Low Risk | 688 | 6.9%
A-Very Low Risk | 632 | 6.3%
D-Very Low Risk | 513 | 5.1%
B-Very Low Risk | 405 | 4.0%
M-Very High Risk | 385 | 3.9%
K-High Risk | 377 | 3.8%
F-Low Risk | 371 | 3.7%
H-Medium Risk | 286 | 2.9%
E-Low Risk | 254 | 2.5%
Other values (9) | 1178 | 11.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
available | 5313 | 14.8%
no | 5116 | 14.3%
history | 5069 | 14.1%
bureau | 4911 | 13.7%
risk | 4518 | 12.6%
low | 2238 | 6.2%
not | 899 | 2.5%
c-very | 688 | 1.9%
a-very | 632 | 1.8%
scored | 571 | 1.6%
Other values (29) | 5902 | 16.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 25857 | 11.7%
i | 17180 | 7.8%
a | 16174 | 7.3%
o | 15730 | 7.1%
e | 15202 | 6.9%
r | 13592 | 6.1%
u | 11005 | 5.0%
l | 10738 | 4.8%
s | 10241 | 4.6%
y | 7902 | 3.6%
Other values (38) | 77936 | 35.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 151438 | 68.4%
Uppercase Letter | 38763 | 17.5%
Space Separator | 25857 | 11.7%
Dash Punctuation | 4518 | 2.0%
Other Punctuation | 571 | 0.3%
Decimal Number | 148 | 0.1%
Open Punctuation | 131 | 0.1%
Close Punctuation | 131 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 17180 | 11.3%
a | 16174 | 10.7%
o | 15730 | 10.4%
e | 15202 | 10.0%
r | 13592 | 9.0%
u | 11005 | 7.3%
l | 10738 | 7.1%
s | 10241 | 6.8%
y | 7902 | 5.2%
t | 7381 | 4.9%
Other values (12) | 26293 | 17.4%
Uppercase Letter
Value | Count | Frequency (%)
H | 6336 | 16.3%
N | 6015 | 15.5%
A | 5832 | 15.0%
B | 5316 | 13.7%
R | 4518 | 11.7%
L | 3062 | 7.9%
V | 2664 | 6.9%
M | 901 | 2.3%
S | 729 | 1.9%
C | 688 | 1.8%
Other values (9) | 2702 | 7.0%
Decimal Number
Value | Count | Frequency (%)
3 | 74 | 50.0%
6 | 74 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 25857 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 4518 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
: | 571 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 131 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 131 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 190201 | 85.8%
Common | 31356 | 14.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 17180 | 9.0%
a | 16174 | 8.5%
o | 15730 | 8.3%
e | 15202 | 8.0%
r | 13592 | 7.1%
u | 11005 | 5.8%
l | 10738 | 5.6%
s | 10241 | 5.4%
y | 7902 | 4.2%
t | 7381 | 3.9%
Other values (31) | 65056 | 34.2%
Common
Value | Count | Frequency (%)
(space) | 25857 | 82.5%
- | 4518 | 14.4%
: | 571 | 1.8%
( | 131 | 0.4%
) | 131 | 0.4%
3 | 74 | 0.2%
6 | 74 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 221557 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 25857 | 11.7%
i | 17180 | 7.8%
a | 16174 | 7.3%
o | 15730 | 7.1%
e | 15202 | 6.9%
r | 13592 | 6.1%
u | 11005 | 5.0%
l | 10738 | 4.8%
s | 10241 | 4.6%
y | 7902 | 3.6%
Other values (38) | 77936 | 35.2%

PRI_OVERDUE_ACCTS
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1586
Minimum: 0
Maximum: 12
Zeros: 8846
Zeros (%): 88.5%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 12
Range: 12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5428402229
Coefficient of variation (CV): 3.422700018
Kurtosis: 84.73015225
Mean: 0.1586
Median Absolute Deviation (MAD): 0
Skewness: 6.742497825
Sum: 1586
Variance: 0.2946755076
Monotonicity: Not monotonic
2021-08-02T12:56:57.068938image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
Histogram with fixed size bins (bins=11)
Value  Count  Frequency (%)
0      8846   88.5%
1      877    8.8%
2      199    2.0%
3      50     0.5%
4      12     0.1%
5      6      0.1%
7      3      < 0.1%
6      2      < 0.1%
12     2      < 0.1%
9      2      < 0.1%

Value  Count  Frequency (%)
0      8846   88.5%
1      877    8.8%
2      199    2.0%
3      50     0.5%
4      12     0.1%
5      6      0.1%
6      2      < 0.1%
7      3      < 0.1%
8      1      < 0.1%
9      2      < 0.1%

Value  Count  Frequency (%)
12     2      < 0.1%
9      2      < 0.1%
8      1      < 0.1%
7      3      < 0.1%
6      2      < 0.1%
5      6      0.1%
4      12     0.1%
3      50     0.5%
2      199    2.0%
1      877    8.8%
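
The descriptive statistics reported for PRI_OVERDUE_ACCTS can be cross-checked against its value counts. The following is a minimal sketch, assuming the counts read off the frequency tables above (including the single observation with value 8 from the extreme-values listing) and a sample variance with an n - 1 denominator. The report does not state which denominator it uses, so that choice is an assumption.

```python
import math

# Value -> count for PRI_OVERDUE_ACCTS, read off the frequency tables above.
counts = {0: 8846, 1: 877, 2: 199, 3: 50, 4: 12, 5: 6,
          6: 2, 7: 3, 8: 1, 9: 2, 12: 2}

n = sum(counts.values())                        # number of observations
total = sum(v * c for v, c in counts.items())   # "Sum" in the report
mean = total / n

# Sample variance (n - 1 denominator); assumed, not stated in the report.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                 # coefficient of variation
zeros_pct = 100 * counts[0] / n                 # share of zero values

print(n, total, mean)
print(variance, std, cv, zeros_pct)
```

Under these assumptions the computed mean, variance, standard deviation and coefficient of variation agree with the reported figures to within rounding.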

NEW_ACCTS_IN_LAST_SIX_MONTHS
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct        14
Distinct (%)    0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            0.39
Minimum         0
Maximum         19
Zeros           7755
Zeros (%)       77.5%
Negative        0
Negative (%)    0.0%
Memory size     78.2 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           2
Maximum                    19
Range                      19
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               0.9605166626
Coefficient of variation (CV)    2.462863238
Kurtosis                         38.3000325
Mean                             0.39
Median Absolute Deviation (MAD)  0
Skewness                         4.563673903
Sum                              3900
Variance                         0.9225922592
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=14)
Value             Count  Frequency (%)
0                 7755   77.5%
1                 1365   13.7%
2                 508    5.1%
3                 189    1.9%
4                 92     0.9%
5                 36     0.4%
6                 27     0.3%
7                 13     0.1%
8                 6      0.1%
9                 4      < 0.1%
Other values (4)  5      0.1%

Value  Count  Frequency (%)
0      7755   77.5%
1      1365   13.7%
2      508    5.1%
3      189    1.9%
4      92     0.9%
5      36     0.4%
6      27     0.3%
7      13     0.1%
8      6      0.1%
9      4      < 0.1%

Value  Count  Frequency (%)
19     1      < 0.1%
16     1      < 0.1%
11     2      < 0.1%
10     1      < 0.1%
9      4      < 0.1%
8      6      0.1%
7      13     0.1%
6      27     0.3%
5      36     0.4%
4      92     0.9%

AVERAGE_ACCT_AGE
Categorical

HIGH CARDINALITY

Distinct        112
Distinct (%)    1.1%
Missing         0
Missing (%)     0.0%
Memory size     78.2 KiB

0yrs 0mon           5034
0yrs 6mon           249
0yrs 9mon           237
0yrs 7mon           223
0yrs 8mon           222
Other values (107)  4035

Length

Max length     11
Median length  9
Mean length    9.0742
Min length     9

Characters and Unicode

Total characters     90742
Distinct characters  17
Distinct categories  3
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
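
As a rough illustration of how such character properties can be tallied, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The helper name `category_counts` and the three sample strings are illustrative choices, not part of the report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Ll', 'Nd', 'Zs')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Three strings in the same "Xyrs Ymon" format as this variable.
sample = ["0yrs 0mon", "1yrs 7mon", "0yrs 10mon"]
counts = category_counts(sample)

# 'Ll' = Lowercase Letter, 'Nd' = Decimal Number, 'Zs' = Space Separator
print(counts["Ll"], counts["Nd"], counts["Zs"])
```

Each string contributes six lowercase letters and one space, which is consistent with the 60000 lowercase letters and 10000 space separators listed for the 10000 observations of this variable.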

Unique

Unique      19
Unique (%)  0.2%

Sample

1st row  0yrs 0mon
2nd row  0yrs 0mon
3rd row  0yrs 0mon
4th row  0yrs 0mon
5th row  1yrs 7mon

Common Values

Value               Count  Frequency (%)
0yrs 0mon           5034   50.3%
0yrs 6mon           249    2.5%
0yrs 9mon           237    2.4%
0yrs 7mon           223    2.2%
0yrs 8mon           222    2.2%
0yrs 10mon          218    2.2%
1yrs 0mon           213    2.1%
0yrs 11mon          210    2.1%
0yrs 5mon           207    2.1%
1yrs 1mon           190    1.9%
Other values (102)  2997   30.0%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
0yrs               7153   35.8%
0mon               5414   27.1%
1yrs               1600   8.0%
2yrs               664    3.3%
6mon               467    2.3%
5mon               438    2.2%
2mon               435    2.2%
4mon               433    2.2%
1mon               425    2.1%
7mon               420    2.1%
Other values (17)  2551   12.8%

Most occurring characters

Value             Count  Frequency (%)
0                 12958  14.3%
y                 10000  11.0%
r                 10000  11.0%
s                 10000  11.0%
(space)           10000  11.0%
m                 10000  11.0%
o                 10000  11.0%
n                 10000  11.0%
1                 3113   3.4%
2                 1102   1.2%
Other values (7)  3569   3.9%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  60000  66.1%
Decimal Number    20742  22.9%
Space Separator   10000  11.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      12958  62.5%
1      3113   15.0%
2      1102   5.3%
3      716    3.5%
4      566    2.7%
5      508    2.4%
6      508    2.4%
7      437    2.1%
8      427    2.1%
9      407    2.0%

Lowercase Letter
Value  Count  Frequency (%)
y      10000  16.7%
r      10000  16.7%
s      10000  16.7%
m      10000  16.7%
o      10000  16.7%
n      10000  16.7%

Space Separator
Value    Count  Frequency (%)
(space)  10000  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   60000  66.1%
Common  30742  33.9%

Most frequent character per script

Common
Value    Count  Frequency (%)
0        12958  42.2%
(space)  10000  32.5%
1        3113   10.1%
2        1102   3.6%
3        716    2.3%
4        566    1.8%
5        508    1.7%
6        508    1.7%
7        437    1.4%
8        427    1.4%

Latin
Value  Count  Frequency (%)
y      10000  16.7%
r      10000  16.7%
s      10000  16.7%
m      10000  16.7%
o      10000  16.7%
n      10000  16.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  90742  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 12958  14.3%
y                 10000  11.0%
r                 10000  11.0%
s                 10000  11.0%
(space)           10000  11.0%
m                 10000  11.0%
o                 10000  11.0%
n                 10000  11.0%
1                 3113   3.4%
2                 1102   1.2%
Other values (7)  3569   3.9%

CREDIT_HISTORY_LENGTH
Categorical

HIGH CARDINALITY

Distinct        192
Distinct (%)    1.9%
Missing         0
Missing (%)     0.0%
Memory size     78.2 KiB

0yrs 0mon           5016
0yrs 6mon           227
2yrs 1mon           187
2yrs 0mon           173
0yrs 7mon           171
Other values (187)  4226

Length

Max length     11
Median length  9
Mean length    9.0914
Min length     9

Characters and Unicode

Total characters     90914
Distinct characters  17
Distinct categories  3
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      30
Unique (%)  0.3%

Sample

1st row  0yrs 0mon
2nd row  0yrs 0mon
3rd row  0yrs 0mon
4th row  0yrs 0mon
5th row  3yrs 3mon

Common Values

Value               Count  Frequency (%)
0yrs 0mon           5016   50.2%
0yrs 6mon           227    2.3%
2yrs 1mon           187    1.9%
2yrs 0mon           173    1.7%
0yrs 7mon           171    1.7%
1yrs 0mon           138    1.4%
0yrs 11mon          125    1.2%
1yrs 1mon           120    1.2%
0yrs 9mon           115    1.1%
1yrs 6mon           103    1.0%
Other values (182)  3625   36.2%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
0yrs               6283   31.4%
0mon               5536   27.7%
1yrs               1129   5.6%
2yrs               969    4.8%
1mon               540    2.7%
3yrs               535    2.7%
6mon               509    2.5%
7mon               440    2.2%
2mon               434    2.2%
3mon               412    2.1%
Other values (24)  3213   16.1%

Most occurring characters

Value             Count  Frequency (%)
0                 12202  13.4%
y                 10000  11.0%
r                 10000  11.0%
s                 10000  11.0%
(space)           10000  11.0%
m                 10000  11.0%
o                 10000  11.0%
n                 10000  11.0%
1                 3015   3.3%
2                 1439   1.6%
Other values (7)  4258   4.7%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  60000  66.0%
Decimal Number    20914  23.0%
Space Separator   10000  11.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      12202  58.3%
1      3015   14.4%
2      1439   6.9%
3      978    4.7%
4      714    3.4%
6      649    3.1%
5      598    2.9%
7      511    2.4%
9      409    2.0%
8      399    1.9%

Lowercase Letter
Value  Count  Frequency (%)
y      10000  16.7%
r      10000  16.7%
s      10000  16.7%
m      10000  16.7%
o      10000  16.7%
n      10000  16.7%

Space Separator
Value    Count  Frequency (%)
(space)  10000  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   60000  66.0%
Common  30914  34.0%

Most frequent character per script

Common
Value    Count  Frequency (%)
0        12202  39.5%
(space)  10000  32.3%
1        3015   9.8%
2        1439   4.7%
3        978    3.2%
4        714    2.3%
6        649    2.1%
5        598    1.9%
7        511    1.7%
9        409    1.3%

Latin
Value  Count  Frequency (%)
y      10000  16.7%
r      10000  16.7%
s      10000  16.7%
m      10000  16.7%
o      10000  16.7%
n      10000  16.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  90914  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 12202  13.4%
y                 10000  11.0%
r                 10000  11.0%
s                 10000  11.0%
(space)           10000  11.0%
m                 10000  11.0%
o                 10000  11.0%
n                 10000  11.0%
1                 3015   3.3%
2                 1439   1.6%
Other values (7)  4258   4.7%

NO_OF_INQUIRIES
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct        14
Distinct (%)    0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            0.2075
Minimum         0
Maximum         18
Zeros           8673
Zeros (%)       86.7%
Negative        0
Negative (%)    0.0%
Memory size     78.2 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           1
Maximum                    18
Range                      18
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               0.7250491843
Coefficient of variation (CV)    3.494212936
Kurtosis                         109.0026748
Mean                             0.2075
Median Absolute Deviation (MAD)  0
Skewness                         7.801653559
Sum                              2075
Variance                         0.5256963196
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=14)
Value             Count  Frequency (%)
0                 8673   86.7%
1                 951    9.5%
2                 224    2.2%
3                 72     0.7%
4                 34     0.3%
5                 18     0.2%
7                 11     0.1%
9                 5      0.1%
6                 5      0.1%
8                 3      < 0.1%
Other values (4)  4      < 0.1%

Value  Count  Frequency (%)
0      8673   86.7%
1      951    9.5%
2      224    2.2%
3      72     0.7%
4      34     0.3%
5      18     0.2%
6      5      0.1%
7      11     0.1%
8      3      < 0.1%
9      5      0.1%

Value  Count  Frequency (%)
18     1      < 0.1%
17     1      < 0.1%
13     1      < 0.1%
10     1      < 0.1%
9      5      0.1%
8      3      < 0.1%
7      11     0.1%
6      5      0.1%
5      18     0.2%
4      34     0.3%

LOAN_DEFAULT
Categorical

HIGH CORRELATION

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     78.2 KiB

0  6650
1  3350

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     10000
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  1
4th row  1
5th row  0

Common Values

Value  Count  Frequency (%)
0      6650   66.5%
1      3350   33.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      6650   66.5%
1      3350   33.5%

Most occurring characters

Value  Count  Frequency (%)
0      6650   66.5%
1      3350   33.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  10000  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      6650   66.5%
1      3350   33.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  10000  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      6650   66.5%
1      3350   33.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  10000  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      6650   66.5%
1      3350   33.5%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
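
The calculation just described can be sketched in a few lines of plain Python. The function name `pearson_r` and the toy data are illustrative and not taken from the report. Population moments (dividing by n) are used here, since the normalisation cancels in the ratio.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r = pearson_r(x, y)                                  # perfectly linear, increasing
r_rescaled = pearson_r([10 * a + 3 for a in x], y)   # shift and rescale x
r_negated = pearson_r(x, [-b for b in y])            # flip the direction of y

print(r, r_rescaled, r_negated)
```

Shifting and rescaling x leaves r unchanged, while negating y flips its sign, which is the invariance property mentioned above.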

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
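
A minimal sketch of this rank-based calculation follows. The helper names are illustrative, and giving tied values the average of their ranks is an assumption, because the text does not say how ties are handled.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]   # y = x**3: monotonic but not linear

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this monotonic but nonlinear toy relationship ρ reaches its maximum while Pearson's r stays below it, which illustrates the difference between the two coefficients.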

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
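
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements it directly. The function name and toy data are illustrative, tied pairs are counted as neither concordant nor discordant, and whether the report uses this or a tie-adjusted variant such as tau-b is not stated in the text.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]   # one swapped pair relative to x

tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs. Five are concordant and one is discordant, so τ = (5 - 1) / 6.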

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
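
For illustration, the bias-corrected estimator is commonly stated as follows: φ² = χ²/n is reduced by (k - 1)(r - 1)/(n - 1) and floored at zero, and the row and column counts r and k are reduced by (r - 1)²/(n - 1) and (k - 1)²/(n - 1). The sketch below follows that formulation. The function name and toy contingency tables are illustrative, and that the report's implementation matches this exactly is an assumption.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and column total is non-zero.
    """
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

independent = [[10, 20], [20, 40]]   # rows are proportional
perfect = [[50, 0], [0, 50]]         # each row maps to exactly one column

v_independent = cramers_v_corrected(independent)
v_perfect = cramers_v_corrected(perfect)
print(v_independent, v_perfect)
```

The two toy tables sit at the ends of the range: proportional rows give 0, and a one-to-one mapping between rows and columns gives 1.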

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
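
The per-column nullity counts behind such displays can be sketched in plain Python. The helper name `nullity_by_column` and the four toy rows are hypothetical. In particular, which columns of this dataset actually contain missing values is not shown in this part of the report.

```python
def nullity_by_column(rows):
    """Return {column: (missing_count, missing_percent)} for a list of dict rows."""
    n = len(rows)
    result = {}
    for col in rows[0]:
        missing = sum(1 for row in rows if row[col] is None)
        result[col] = (missing, 100.0 * missing / n)
    return result

# Toy rows only; None marks a missing cell.
rows = [
    {"EMPLOYMENT_TYPE": "Salaried", "LTV": 54.79},
    {"EMPLOYMENT_TYPE": None, "LTV": 61.17},
    {"EMPLOYMENT_TYPE": "Self employed", "LTV": 64.77},
    {"EMPLOYMENT_TYPE": None, "LTV": 59.33},
]

nullity = nullity_by_column(rows)
print(nullity)
```

A nullity matrix plots the same per-cell information row by row, and the dendrogram groups columns by how similar their missingness patterns are.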

Sample

First rows

df_index DISBURSED_AMOUNT ASSET_COST LTV BRANCH_ID SUPPLIER_ID MANUFACTURER_ID CURRENT_PINCODE_ID DATE_OF_BIRTH EMPLOYMENT_TYPE DISBURSAL_DATE STATE_ID EMPLOYEE_CODE_ID AADHAR_FLAG PAN_FLAG VOTERID_FLAG PASSPORT_FLAG PERFORM_CNS_SCORE PERFORM_CNS_SCORE_DESCRIPTION PRI_OVERDUE_ACCTS NEW_ACCTS_IN_LAST_SIX_MONTHS AVERAGE_ACCT_AGE CREDIT_HISTORY_LENGTH NO_OF_INQUIRIES LOAN_DEFAULT
041225403947666054.791180804550831974-01-01Salaried2018-10-123161610000No Bureau History Available000yrs 0mon0yrs 0mon00
127185394396539061.17105157924512181971-01-01Salaried2018-10-13693710000No Bureau History Available000yrs 0mon0yrs 0mon00
261563488497720064.7710227428656621991-02-27Self employed2018-08-293174610000No Bureau History Available000yrs 0mon0yrs 0mon01
37402470498208659.331223158650921997-01-01Self employed2018-10-25313610000No Bureau History Available000yrs 0mon0yrs 0mon01
41000560076735185.005156638633731984-03-01Self employed2018-09-0399631000686E-Low Risk101yrs 7mon3yrs 3mon00
584578711179200078.8076172424844671994-06-22Self employed2018-09-20822010000No Bureau History Available000yrs 0mon0yrs 0mon01
687655539636842981.8416181504529431981-05-21Self employed2018-09-17172346100017Not Scored: Not Enough Info available on the customer000yrs 5mon0yrs 5mon01
7119045575597115183.6216183178629591987-02-15Salaried2018-09-201420751000640G-Low Risk001yrs 5mon2yrs 4mon00
834691497637361670.6414217315861091994-10-08Salaried2018-08-24163311000525J-High Risk130yrs 10mon2yrs 6mon00
928616616479461068.7011233994856331980-03-05Self employed2018-08-203139510000No Bureau History Available000yrs 0mon0yrs 0mon00

Last rows

df_index DISBURSED_AMOUNT ASSET_COST LTV BRANCH_ID SUPPLIER_ID MANUFACTURER_ID CURRENT_PINCODE_ID DATE_OF_BIRTH EMPLOYMENT_TYPE DISBURSAL_DATE STATE_ID EMPLOYEE_CODE_ID AADHAR_FLAG PAN_FLAG VOTERID_FLAG PASSPORT_FLAG PERFORM_CNS_SCORE PERFORM_CNS_SCORE_DESCRIPTION PRI_OVERDUE_ACCTS NEW_ACCTS_IN_LAST_SIX_MONTHS AVERAGE_ACCT_AGE CREDIT_HISTORY_LENGTH NO_OF_INQUIRIES LOAN_DEFAULT
999057953507036347081.9367181668615111983-03-29Salaried2018-08-2169491000738C-Very Low Risk020yrs 4mon0yrs 4mon00
999161098449986999865.2165232464968081971-01-14Salaried2018-10-24131991000738C-Very Low Risk011yrs 0mon1yrs 7mon10
999212920498036730276.5216220044529761974-05-12Salaried2018-08-29141941000635G-Low Risk004yrs 0mon8yrs 5mon00
999382518559596988485.86138171754533291982-06-01Self employed2018-10-05921741000719D-Very Low Risk011yrs 3mon3yrs 2mon10
999485209521357025075.70138174085133491976-06-24Self employed2018-10-29926901000763B-Very Low Risk001yrs 3mon3yrs 1mon10
9995716866391312235453.1236246224865601990-01-01Salaried2018-10-251372400100No Bureau History Available000yrs 0mon0yrs 0mon00
999681522485786771674.5874234245126151970-01-01Self employed2018-09-264315910100No Bureau History Available000yrs 0mon0yrs 0mon00
999731944668828700078.1622404012017321985-01-01Salaried2018-09-18426551000737C-Very Low Risk000yrs 11mon0yrs 11mon00
999867406513287725068.93254232925167571977-06-12Self employed2018-10-0313859011017Not Scored: Not Enough Info available on the customer004yrs 2mon4yrs 2mon01
999983835545136492186.2661219118613691975-06-01Self employed2018-08-17616551000763B-Very Low Risk000yrs 11mon0yrs 11mon00